Disambiguation of domain word segmentation based on unsupervised learning
XIU Chi, SONG Rou
Journal of Computer Applications
2013, 33(03): 780-783.
DOI: 10.3724/SP.J.1087.2013.00780
Domain word segmentation is considerably more difficult than general-purpose word segmentation in Chinese natural language processing, and segmentation ambiguity in particular has lacked an effective solution. To address this problem, an unsupervised learning method for resolving domain segmentation ambiguity is proposed. String frequency, mutual information, and boundary entropy are selected as the evaluation criteria for segmentation ambiguity, and the three statistics are applied both individually and in combination. The experimental results suggest that the proposed method resolves domain segmentation ambiguity efficiently and effectively.
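
To make the three statistics concrete, the following is a minimal Python sketch of how string frequency, mutual information, and boundary entropy could be computed from a raw, unsegmented corpus and used to resolve an overlapping ambiguity. It is an illustration under stated assumptions, not the authors' implementation: the function names, the use of pointwise mutual information with corpus length as the normaliser, and the majority-vote combination rule are all hypothetical choices, since the abstract does not specify them.

```python
import math
from collections import Counter


def ngram_counts(corpus, max_len):
    """Count every substring of length 1..max_len in the unsegmented corpus."""
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(corpus) - n + 1):
            counts[corpus[i:i + n]] += 1
    return counts


def pmi(left, right, counts, total):
    """Pointwise mutual information of two adjacent strings.

    Assumption: probabilities are estimated as count / total, where total
    is the corpus length in characters.
    """
    joint = counts[left + right]
    if joint == 0 or counts[left] == 0 or counts[right] == 0:
        return float("-inf")
    return math.log2(joint * total / (counts[left] * counts[right]))


def boundary_entropy(word, corpus, side="right"):
    """Entropy of the characters adjacent to `word` on the given side."""
    neighbours = Counter()
    start = corpus.find(word)
    while start != -1:
        pos = start + len(word) if side == "right" else start - 1
        if 0 <= pos < len(corpus):
            neighbours[corpus[pos]] += 1
        start = corpus.find(word, start + 1)
    total = sum(neighbours.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in neighbours.values())


def resolve_overlap(left, mid, right, corpus):
    """Choose between 'left+mid | right' and 'left | mid+right'.

    Assumption: the three statistics are combined by simple majority vote;
    the paper may combine them differently.
    """
    w1, w2 = left + mid, mid + right
    counts = ngram_counts(corpus, max(len(w1), len(w2)))
    total = len(corpus)
    votes = 0
    votes += counts[w1] > counts[w2]  # string frequency
    votes += pmi(left, mid, counts, total) > pmi(mid, right, counts, total)
    # A varied right context after w1 (or left context before w2)
    # suggests a genuine word boundary at that position.
    votes += boundary_entropy(w1, corpus, "right") > boundary_entropy(w2, corpus, "left")
    return (w1, right) if votes >= 2 else (left, w2)
```

For an ambiguous string such as `left + mid + right`, calling `resolve_overlap(left, mid, right, corpus)` with a large unlabelled domain corpus returns the preferred two-way split. No annotated training data is needed, which is what makes the approach unsupervised.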